- Path: cs.utexas.edu!not-for-mail
- From: wilson@cs.utexas.edu (Paul Wilson)
- Newsgroups: comp.lang.lisp,comp.lang.c++,comp.lang.c,comp.lang.misc,comp.theory
- Subject: allocator studies (was Re: GC & traditional allocators & textbooks)
- Date: 5 Feb 1996 09:56:51 -0600
- Organization: CS Dept, University of Texas at Austin
- Distribution: inet
- Message-ID: <4f59c3$7il@jive.cs.utexas.edu>
- References: <rvillDL4v3n.I8r@netcom.com> <823365355snz@wildcard.demon.co.uk> <4f2ila$6p8@jive.cs.utexas.edu> <823455623snz@wildcard.demon.co.uk>
- NNTP-Posting-Host: jive.cs.utexas.edu
-
- In article <823455623snz@wildcard.demon.co.uk>,
- Cyber Surfer <cyber_surfer@wildcard.demon.co.uk> wrote:
- >In article <4f2ila$6p8@jive.cs.utexas.edu>
- > wilson@cs.utexas.edu "Paul Wilson" writes:
- >
- >> The history of allocator
- >> research has been a big mess---the literature is a bit of a disaster
- >> area---and the textbooks reflect this. The analyses in the books are
- >> shallow and largely wrong. (This is less attributable to the textbook
- >> authors than the weak discussions of GC are. It's not their fault that they
- >> can't summarize the gist of the literature and get it right, because
- >> the literature in general is disorganized, inconsistent, and often wrong.)
- >
- >I won't argue with that! ;-) I'd love to see a good summary.
-
- Well, very briefly:
-
- 1. Traditional allocator research has largely missed the point, because
- the true causes of fragmentation haven't been studied much. Most
- studies (recent exceptions being Zorn, Grunwald, and Barrett's
- work at the U. of Colorado at Boulder, and Phong Vo's latest paper
- from Bell Labs) have used synthetic traces whose realism has never
- been validated, rather than real program behavior.
-
- 2. The standard synthetic traces of "program" behavior are unrealistic,
- because they're based on pseudo-random sequences, even if the
- object size and lifetime distributions are realistic. For good
- allocator policies, this causes an unrealistically high degree
- of fragmentation, because you're shooting holes in the heap at
- random. For some programs and some allocators, though, you get
- unrealistically low fragmentation. In general, the random trace
- methodology makes most allocators' performance look pretty similar,
- when in fact the regularities in real traces make some allocators
- work *much* better than others.
-
- 3. People have often focused on the speed of allocator mechanisms,
- at the expense of studying the effects of policy in a realistic
- context. As it turns out, some of the best policies are amenable
- to fast implementations if you just think about it as a normal
- data-structures-and-algorithms problem. The best policies have
- often been neglected because people confused policy and mechanism,
- and the best-known mechanisms were slow.
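- To make point 3 concrete: best fit is just a successor query ("smallest
- free block at least as large as the request"), so an ordinary ordered
- data structure makes the policy cheap. The sketch below is my own toy
- illustration of that observation (a sorted array via bisect; a serious
- allocator would use a balanced tree or segregated size bins), not
- anyone's published implementation:

```python
import bisect

class BestFitIndex:
    """Toy best-fit search over free-block sizes as an ordered-set problem."""

    def __init__(self):
        self.sizes = []   # sorted list of free-block sizes

    def insert(self, size):
        # keep the size list sorted; O(log n) search, O(n) shift
        bisect.insort(self.sizes, size)

    def take_best_fit(self, request):
        # successor query: smallest size >= request
        i = bisect.bisect_left(self.sizes, request)
        if i == len(self.sizes):
            return None               # no free block fits
        return self.sizes.pop(i)      # remove and return the best fit

idx = BestFitIndex()
for s in (64, 16, 128):
    idx.insert(s)
```

- With free blocks of 16, 64, and 128 bytes, a 20-byte request gets the
- 64-byte block (the smallest that fits), not the first or largest one.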
-
- 4. People have usually failed to separate out "true" fragmentation
- costs---due to the allocator's placement policy and its interactions
- with real program behavior---from simple overheads of simplistic
- mechanisms. Straightforward tweaks can reduce these costs easily.
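- One way to make point 4's accounting concrete (the particular split into
- per-block mechanism overhead vs. placement-policy waste below is my own
- illustration, not a formula from the literature):

```python
def policy_fragmentation(heap_bytes, live_bytes, n_blocks, header_bytes=8):
    """Separate placement-policy waste from simple mechanism overhead.

    heap_bytes:   total memory the allocator holds
    live_bytes:   bytes the program requested and still holds
    header_bytes: assumed per-block bookkeeping cost of the mechanism
    """
    mechanism_overhead = n_blocks * header_bytes
    policy_waste = heap_bytes - live_bytes - mechanism_overhead
    # "true" fragmentation: waste attributable to placement decisions
    return policy_waste / live_bytes
```

- E.g., a 1000-byte heap holding 800 live bytes in 10 blocks with 8-byte
- headers wastes 80 bytes on the mechanism, so only 120 bytes (15% of
- live data) are chargeable to the placement policy itself.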
-
- 5. The best-known policies (best fit and address-ordered first fit)
- work better than anyone ever knew, while some policies (Knuth's
- "Modified First Fit" or "Next Fit") work worse than anyone ever
- knew, for some programs. This is because randomized traces tend
- probabilistically to ensure that certain important realistic
- program behaviors will never happen in traditional experiments.
-
- 6. None of the analytical results in the literature (beginning with
- Knuth's "fifty percent rule") are valid. They're generally based
- on two or three assumptions that are all systematically false for
- most real programs. (Random allocation order, steady-state behavior,
- and independence of sizes and lifetimes. To make mathematical
- analyses tractable, people have sometimes made stronger and even
- more systematically false assumptions, like exponentially-distributed
- random lifetimes.)
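- To show what the criticized methodology looks like, here is a sketch of
- a random-trace generator of the traditional kind. The size list and the
- rate parameter are arbitrary stand-ins (not from any published study);
- the point is that it bakes in exactly the assumptions listed above:
- random order, sizes drawn independently of lifetimes, and
- exponentially distributed lifetimes.

```python
import heapq
import random

def synthetic_trace(n_objects, seed=1):
    """Generate an alloc/free event trace the 'traditional' way."""
    rng = random.Random(seed)
    sizes = [16, 24, 32, 48, 64, 128]   # assumed size distribution
    events, pending, t = [], [], 0
    for obj in range(n_objects):
        t += 1
        # free every object whose randomly drawn lifetime has expired;
        # this is what shoots holes in the heap at random
        while pending and pending[0][0] <= t:
            events.append(("free", heapq.heappop(pending)[1]))
        size = rng.choice(sizes)                     # independent of lifetime
        lifetime = 1 + int(rng.expovariate(1 / 50))  # exponential assumption
        events.append(("alloc", obj, size))
        heapq.heappush(pending, (t + lifetime, obj))
    while pending:                                   # drain at end of run
        events.append(("free", heapq.heappop(pending)[1]))
    return events
```

- A real program's trace has phases, clustered deaths, and sizes that
- correlate with lifetimes; nothing in this generator can produce those
- regularities, which is the whole problem.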
-
- 7. Because of these problems, most allocator research over the last
- 35 years has been a big waste of time, and we still don't know
- much more than we knew in 1968, and almost nothing we didn't
- know in the mid-70's.
-
- > [...]
- >> "Modified First Fit" with the roving pointer is the clearest example. It
- >> was a bad idea, and it was quickly shown to be bad, but some other textbook
- >> writers keep mindlessly cribbing from Knuth, and even some implementors still
- >> use it.
- >
- >Is it really worse than Best Fit? I've wondered about that ever
- >since I first read that book. You seem like a good person to ask. ;)
-
- Yes, it is significantly worse, and some versions are often far worse, even
- pathological. If you keep the free list in address order, it's just worse.
- If you keep the free list in LIFO order, it can be pathological. (LIFO
- order is attractive because it's cheaper than address order, and if all
- other things were equal, it would improve locality.) For some programs,
- it systematically takes little bites out of big free blocks, making those
- blocks unusable for allocation requests of the same big size in the future.
- Many traditional experiments have masked this effect by using smooth
- distributions of object sizes, so that odd-sized blocks are less of a
- problem.
-
- (Given random sizes, you're likely to have a request that's a good fit
- for the block size pretty soon, even after taking a bite out of it. For
- real traces, you're likely *not* to get any requests for
- comparable-but-smaller sizes anytime soon. And the randomization of
- order tends to eliminate the potential pathological interactions due
- to particular sequence behavior. You eliminate the extreme cases.)
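- The bite-taking effect is easy to see in a toy free-list simulation
- (policy only, no real memory; the block layout and request sequence
- below are contrived by me for illustration, not measured from any
- program):

```python
# Free blocks are (addr, size) pairs kept in address order.
class FreeList:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.rover = 0                # roving pointer for next fit

    def _take(self, i, size):
        addr, bsize = self.blocks[i]
        if bsize == size:
            del self.blocks[i]        # exact fit: block consumed whole
            if self.rover >= len(self.blocks):
                self.rover = 0
        else:
            # split: take a bite and leave the remainder on the list
            self.blocks[i] = (addr + size, bsize - size)
        return addr

    def best_fit(self, size):
        fits = [i for i, (_, s) in enumerate(self.blocks) if s >= size]
        if not fits:
            return None
        return self._take(min(fits, key=lambda i: self.blocks[i][1]), size)

    def next_fit(self, size):
        n = len(self.blocks)
        for k in range(n):
            i = (self.rover + k) % n  # resume scan where we left off
            if self.blocks[i][1] >= size:
                self.rover = i
                return self._take(i, size)
        return None

layout = [(0, 16), (16, 16), (32, 16), (100, 1000)]
bf, nf = FreeList(layout), FreeList(layout)
for r in (17, 16, 16, 16):
    bf.best_fit(r)
    nf.next_fit(r)
```

- After the run, best fit has consumed the three exact-fit 16-byte blocks
- and left the big block nearly whole (983 bytes), while next fit's roving
- pointer, parked at the big block after the first split, nibbled it down
- to 935 bytes and never touched the small blocks; a later 950-byte
- request succeeds under best fit and fails under next fit.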
-
- >> Obligatory positive comment: the best textbook discussion of allocators
- >> that I know of is Tim Standish's in _Data_Structure_Techniques_. He doesn't
- >> recognize the near-universal methodological problems with allocator studies,
- >> but he's unique in recognizing the basic data structure and algorithm issues
- >> in implementing allocators.
- >
- >I'll see if I can find that book. Thanks.
- >
- >> [Our] site [http://www.cs.utexas.edu/users/oops/] also has our big survey
- >> on memory allocators from IWMM '95, which I hope will influence future
- >> textbooks. It talks about empirical methodology as well as giving a
- >> fairly exhaustive treatment of implementation techniques.
-
- --
- | Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (wilson@cs.utexas.edu)
- | Papers on memory allocators, garbage collection, memory hierarchies,
- | persistence and Scheme interpreters and compilers available via ftp from
- | ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)
-